Optimization of Features Parameters for HMM Phoneme Recognition of TIMIT Corpus
Abstract
A phoneme is the smallest contrastive unit in the sound system of a language, and it plays a meaningful role in speech recognition. In this study, we address phoneme recognition on the TIMIT database using the HTK toolkit for HMMs. The main goal is to determine the optimal parameters for the recognizer. To this end, different speech analysis techniques were applied, such as Mel Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding (LPC) and Perceptual Linear Prediction (PLP). These techniques were improved by adding temporal derivatives and energy to capture the temporal dynamics of the parameters. Results revealed that the MFCC and PLP techniques gave reliable recognition rates using 39 coefficients.
Keywords: Feature extraction, HMM, HTK, LPC, MFCC, PLP, TIMIT
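To make the "temporal derivatives" step concrete, here is a minimal Python sketch of how a static feature matrix can be extended with first and second derivatives. It is an illustration under stated assumptions, not the authors' implementation: the abstract only reports 39 coefficients, so the split into 13 static values (for example 12 cepstral coefficients plus energy) tripled by derivatives is assumed, as are the regression window of 2 frames, the edge padding, and the function names `deltas` and `add_dynamics`.

```python
import numpy as np


def deltas(feats: np.ndarray, window: int = 2) -> np.ndarray:
    """Regression-based temporal derivatives over +/- `window` frames.

    feats has shape (num_frames, num_coeffs). The first and last frames
    are repeated at the edges (an assumed padding choice).
    """
    num_frames = feats.shape[0]
    padded = np.pad(feats, ((window, window), (0, 0)), mode="edge")
    denom = 2.0 * sum(k * k for k in range(1, window + 1))
    out = np.zeros_like(feats, dtype=float)
    for k in range(1, window + 1):
        ahead = padded[window + k : window + k + num_frames]
        behind = padded[window - k : window - k + num_frames]
        out += k * (ahead - behind)
    return out / denom


def add_dynamics(static: np.ndarray, window: int = 2) -> np.ndarray:
    """Stack static features with their first and second derivatives."""
    first = deltas(static, window)
    second = deltas(first, window)
    return np.hstack([static, first, second])


# Hypothetical input: 100 frames of 13 static values (assumed layout).
static = np.random.default_rng(0).normal(size=(100, 13))
full = add_dynamics(static)
print(full.shape)
```

With 13 static values per frame, the stacked matrix has 39 columns per frame, which matches the dimensionality reported in the abstract under the assumed layout.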
Similar resources
Phoneme Recognition using Hidden Markov Models: Evaluation with signal parameterization techniques
Applications of HMMs show that they are an effective and powerful modelling tool, especially for stochastic signals. For this reason, we use HMMs for TIMIT phoneme recognition. The main goal is to study the performance of an HMM phoneme recognizer in order to settle on optimal signal parameters. So, we apply different speech parameterization techniques such as MFCC, LPCC and PLP. Then, we compare the recog...
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...
Bidirectional LSTM Networks for Improved Phoneme Classification and Recognition
In this paper, we carry out two experiments on the TIMIT speech corpus with bidirectional and unidirectional Long Short Term Memory (LSTM) networks. In the first experiment (framewise phoneme classification) we find that bidirectional LSTM outperforms both unidirectional LSTM and conventional Recurrent Neural Networks (RNNs). In the second (phoneme recognition) we find that a hybrid BLSTM-HMM s...
State-level variable modeling for
In HMM-based pattern recognition, the structure of the HMM is often predetermined according to some prior knowledge. In the recognition process, we usually make our judgment based on the maximum likelihood of the HMM, without considering the time-varying property of state-level variables, which unfortunately may lead to incorrect results. In this paper, we analyze the property of state-level va...
Speaker-independent phoneme alignment using transition-dependent states
Determining the location of phonemes is important to a number of speech applications, including training of automatic speech recognition systems, building text-to-speech systems, and research on human speech processing. Agreement of humans on the location of phonemes is, on average, 93.78% within 20 msec on a variety of corpora, and 93.49% within 20 msec on the TIMIT corpus. We describe a basel...